Chinese Paraphrases Acquiring Based on Random Walk N Steps

Authors

  • Jun Ma
  • Yujie Zhang
  • Jinan Xu
  • Yufeng Chen
Abstract

Jun Ma, Yujie Zhang, Jinan Xu, Yufeng Chen (Beijing Jiaotong University, Beijing, 100044, China)

The conventional “pivot” approach to acquiring paraphrases from a bilingual corpus is limited in that it considers only candidate paraphrases reachable within two steps. In this paper, we propose a graph-based model for acquiring paraphrases from a phrase translation table. First, we describe a graph model built on a Chinese-English phrase translation table, a random walk algorithm with N steps, and a confidence metric for the acquired paraphrases. We then extend the model to integrate additional language pairs, for instance an English-Japanese phrase translation table, with the aim of finding more potential Chinese paraphrases. We ran experiments on the NTCIR Chinese-English and English-Japanese bilingual corpora and compared the results with the conventional method. The experimental results show that the proposed model acquired more paraphrases, and that performance improved further after the English-Japanese phrase translation table was added to the graph model.
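The idea of walking more than two steps over a phrase-table graph can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names (`build_graph`, `walk_distributions`, `paraphrase_scores`), the toy phrase table and its probabilities, and in particular the score, which here is simply the highest probability of arriving at a Chinese phrase at any step up to N. The abstract does not specify the paper's actual confidence metric.

```python
from collections import defaultdict

def build_graph(phrase_table):
    """Build a directed graph from (zh, en, p(en|zh), p(zh|en)) rows.

    Nodes are (language, phrase) pairs; edge weights are the translation
    probabilities in the direction of the edge. (Hypothetical row format.)
    """
    graph = defaultdict(dict)
    for zh, en, p_en_given_zh, p_zh_given_en in phrase_table:
        graph[("zh", zh)][("en", en)] = p_en_given_zh
        graph[("en", en)][("zh", zh)] = p_zh_given_en
    return graph

def walk_distributions(graph, start, n_steps):
    """Yield the exact node distribution of the walk after steps 1..n_steps."""
    dist = {start: 1.0}
    for _ in range(n_steps):
        nxt = defaultdict(float)
        for node, mass in dist.items():
            out_edges = graph.get(node, {})
            total = sum(out_edges.values())
            if total == 0:
                continue  # dead end: this probability mass is dropped
            for neighbour, weight in out_edges.items():
                nxt[neighbour] += mass * weight / total
        dist = dict(nxt)
        yield dist

def paraphrase_scores(graph, source, n_steps):
    """Score Chinese phrases reachable from `source` within n_steps.

    Assumed score: the maximum arrival probability over all steps.
    """
    scores = defaultdict(float)
    for dist in walk_distributions(graph, ("zh", source), n_steps):
        for (lang, phrase), mass in dist.items():
            if lang == "zh" and phrase != source:
                scores[phrase] = max(scores[phrase], mass)
    return dict(scores)

# Toy phrase table with made-up probabilities (illustration only).
toy_table = [
    ("开始", "begin", 1.0, 0.5),
    ("启动", "begin", 0.5, 0.5),
    ("启动", "launch", 0.5, 0.5),
    ("发起", "launch", 1.0, 0.5),
]
toy_graph = build_graph(toy_table)
two_step = paraphrase_scores(toy_graph, "开始", 2)   # pivot-style, 2 steps
four_step = paraphrase_scores(toy_graph, "开始", 4)  # longer walk
print(two_step)
print(four_step)
```

In this toy graph the two-step walk reaches only 启动 through the shared pivot "begin", while the four-step walk also reaches 发起 through the second pivot "launch". This is the kind of candidate the abstract says a two-step pivot method cannot consider.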

Similar resources

Automatic Acquisition of Context-Specific Lexical Paraphrases

Lexical paraphrasing aims at acquiring word-level paraphrases. It is critical for many Natural Language Processing (NLP) applications, such as Question Answering (QA), Information Extraction (IE), and Machine Translation (MT). Since the meaning and usage of a word can vary in distinct contexts, different paraphrases should be acquired according to the contexts. However, most of the existing res...

Hitting the Right Paraphrases in Good Time

We present a random-walk-based approach to learning paraphrases from bilingual parallel corpora. The corpora are represented as a graph in which a node corresponds to a phrase, and an edge exists between two nodes if their corresponding phrases are aligned in a phrase table. We sample random walks to compute the average number of steps it takes to reach a ranking of paraphrases with better ones...

Generating Random Elements in SL_n(F_q) by Random Transvections

Abstract. This paper studies a random walk based on random transvections in SL_n(F_q) and shows that, given ε > 0, there is a constant c such that after n + c steps the walk is within a distance ε from uniform and that after n − c steps the walk is a distance at least 1 − ε from uniform. This paper uses results of Diaconis and Shahshahani to get the upper bound, uses results of Rudvalis to get the lo...

An Elementary Proof of the Hitting Time Theorem

In this note, we give an elementary proof of the random walk hitting time theorem, which states that, for a left-continuous random walk on Z starting at a nonnegative integer k, the conditional probability that the walk hits the origin for the first time at time n, given that it does hit zero at time n, is equal to k/n. Here, a walk is called left-continuous when its steps are bounded from belo...

Chinese Whispers: Cooperative Paraphrase Acquisition

We present a framework for the acquisition of sentential paraphrases based on crowdsourcing. The proposed method maximizes the lexical divergence between an original sentence s and its valid paraphrases by running a sequence of paraphrasing jobs carried out by a crowd of non-expert workers. Instead of collecting direct paraphrases of s, at each step of the sequence workers manipulate semantical...

Publication date: 2016